Forum:Google rank
Category:promotion
Category:GEDCOM use

Introduction

I just googled Charlemagne. We're not in the top 100. I googled Charlemagne+family. We're not in the top 100. I googled Charlemagne+genealogy. We're numbers 2 and 3. So, the good news is that genealogists would find us. I would think, though, that more people would search on family than on genealogy. That suggests that we should use the word family more often. rtol 19:07, 24 May 2009 (UTC)

What is the advantage of being found? As long as we don't have a good way to upload GEDCOMs, to search for duplicates and to merge duplicates, genealogists with bigger databases will not be interested in this site! First we have to prepare good tools; after that we can succeed in growing. First quality, then quantity! Fred Bergman 20:23, 24 May 2009 (UTC)
:Google algorithms generally work on the principle of inbound links, not particular terms used. The more people link to us, the higher our rank will get. The way to get people to link to us is to get them to like us. People will like us because we are free, collaborative and relatively easy to add rich content to. We are not comprehensive yet, but I am working on the foundations that will support the massive weight that will be added in the coming months. We don't want to have a repeat of what happened at Pisa.
:It is certainly trivial to dust off the shovel and sling hordes of content into Familypedia. I think you are right that we want to do that without sacrificing quality. It seems to me that we need an assistance tool that allows people to confirm whether a new GEDCOM individual is identical to an existing one or not. To this end, Fred, I am looking at AWB (nl article). Those of you who are admins can try it out now on our articles (anyone else will be able to use it too, but will need to get approval at wikia central first). I have assembled a cryptic set of instructions here.
(Please feel free to make improvements, or move the "how to" portion of the setup instructions to the Genealogy:AutoWikiBrowser page). As it is, it will be able to help people make global search-and-replace type fixes to Familypedia, so folks may want to start working with it now. This is an extensible tool and I am able to compile my own versions of it along with plugin modules on my machine. If I can maintain my focus on this project, I envision adding GEDCOM input to AWB. The idea is that a Familypedia contributor would be able to make sure the article was reasonable before adding it, in the same way that AWB currently works. In this way we will be able to avoid the problems that other genealogy sites have with shoveled content. We want no obvious clones of other individuals and no mangling of critical data (dates, locations or names) before accepting it. Our competitors can't do it because it requires human eyeballs on these problems, and they don't have dedicated collaborators. We do. That is why Familypedia will crush the opposition. So fear not, brave knight. We shall prevail. -~ Phlox 16:28, 25 May 2009 (UTC)
::True and not. Our page on Charlemagne did not mention the word "family" at all, so a search for Charlemagne's family does not lead to Familypedia. Inbound links are indeed very important, but then we first need to offer something worth linking to. rtol 17:37, 25 May 2009 (UTC)
:Thank you, Richard, for reintroducing the subject and for stressing the value of the word "family". Thank you, Fred and Phlox, for the cautions. Thank you, Phlox, for all the illustrations and pointers. But I feel that crushing the opposition should not be among our aims. -- Robin Patterson (Talk) 04:11, 26 May 2009 (UTC)
:I agree with Robin. It looks like television, trying to get the most viewers in order to get the highest advertising income!
If you want to make the content attractive to outsiders, then you should consider showing starting points for representative and attractive pedigrees and trees on the homepage! Fred Bergman 07:12, 26 May 2009 (UTC)
::Fred, making the main page attractive with good jumping-off points would be a good step forward. As for whether this will directly affect Google ranking, Google is not significantly impacted by what is on the home page or what words you use. For more info, consider the following article: Nl article on PageRank. The en article has more of the detail and math theory involved.
::The goal is to make Familypedia one of the foremost genealogy sites to link to. That is not the case now. I don't think we have to be like television to make that so. If folks here aren't motivated by competitive talk in aspiring to that "foremost genealogy site" status, then fine, choose another metaphor. But we must do much better to raise our quality up a notch. -~ Phlox 08:25, 26 May 2009 (UTC)

Goals

Are we talking about different goals, or not? I thought that first we want to have a real genealogical site; second, we want as many contributors as possible. Phlox and Richard Tol are working hard to introduce real genealogical tools; Richard is also working hard to make an attractive site with good content on medieval ancestors, nobles and royalty. Robin tries to coordinate and to steer in the right direction. I am just a consumer contributor. We will never get good contributors without good genealogical tools, so that is most important. Once we have a real genealogical site, we need to get the good contributors. Via Google we can get their attention. But when we have their attention and they come to look at us, then the site and the first thing they see of it, the homepage, must be representative and give an example of what is here, and the best of what is here is, I think, the result of Richard's work.
A good starting point for a tree is Charlemagne, and a good starting point for a pedigree can be Willem Alexander of Oranje Nassau. But first things first! Fred Bergman 16:44, 26 May 2009 (UTC)

Interest from Dutch group

At the moment a group of Dutch genealogists is looking for a new way to build a genealogical database on which a large number of Dutch genealogists can cooperate. Richard and I tried to get these people to come to Familypedia, but they are of the opinion that Familypedia is not good enough. They want certain tools and features that Familypedia does not have, and they want each contributor to have their own part, separate from the common base. Fred Bergman 16:44, 26 May 2009 (UTC)
:On the Dutch crowd, what they really wanted is a wiki with proper database functions. We have that now. rtol 17:27, 26 May 2009 (UTC)
::I'm curious about the meaning of functions here. Are you referring to semantic stuff, like properties? --Borgsteede 11:32, 30 May 2009 (UTC)
::[Translated from Dutch:] Have you read "Who will join one shared genealogy database in which the connecting points are merged?" I have put it on Robin's talk page; I hope his passive knowledge of Dutch is sufficient. Fred Bergman 21:32, 26 May 2009 (UTC)
:::Sorry, guys, I looked at Fred's correspondence copy, but my Dutch is not up to that. I can get a phrase or two and several isolated words from an average sentence, not the whole thing. I trust that rtol has been able to look at it. -- Robin Patterson (Talk) 06:18, 27 May 2009 (UTC)
::::Robin, you disappoint me. Essentially, they are bitching about whether they want to reinvent the wheel and whether it would be a square or a triangular wheel. rtol 07:02, 27 May 2009 (UTC)
:::::I presume that they are as thick-skinned as you and Fred are!!! I hope we can have several of them joining this discussion. We seem to have one today.
-- Robin Patterson (Talk) 13:24, 30 May 2009 (UTC)

Misunderstanding

If what they want is to own part of the site, then a wiki by its very nature will never be good enough, no matter the tools. William Allen Shade 01:26, 27 May 2009 (UTC)
:They don't want to own a site. They want a real genealogical site with safety tools, and this site doesn't have those. It is easy to make mistakes here, and the site doesn't give warnings. There are a lot of ways to make this site better; it is not yet able to compete with real genealogical sites.
:But the real genealogical sites don't have one connected common tree and pedigrees; this site has.
:The best genealogical wiki site is WeRelate.org, but the management there is not reliable and operates on a God-based footing. For that reason the Dutch want the WeRelate system, and they want to improve on it. Fred Bergman 05:49, 27 May 2009 (UTC)
::Mistakes? - heading below
::I think that if they really want the WeRelate system ... heading below
::Interesting that neither we nor WeRelate is in the ProGenealogists' "50 Most Popular Genealogy Websites for 2009". Bottom of that list is a site with 50,000 genealogical links. -- Robin Patterson (Talk) 13:24, 30 May 2009 (UTC)

Own parts for each contributor?

That sounds like something that MediaWiki allows if management approves: something about restricting edits of your user page and its subpages. -- Robin Patterson (Talk) 06:23, 27 May 2009 (UTC)
:Some of these Dutch are conservative, and so am I. They want to be able to download their own data at any moment and to preserve their own data without anyone else changing it; that is what they call their own territory. Besides that, they give their data to the common tree and cooperate there.
:At the moment I have two sites for this purpose: my data are for me alone at http://www.geneaweb.org/bergsmit, and I cooperate here. Fred Bergman 06:46, 27 May 2009 (UTC)
::Not a problem. Anyone can download any or all of Familypedia at any time.
See . -- Robin Patterson (Talk) 03:16, 30 May 2009 (UTC)
:::No, that's not what I mean. When you use that export, you get wiki pages, but most genealogists are far more interested in exporting GEDCOMs, i.e. structured genealogical data. WeRelate can do that, because it stores all user contributions in a true genealogical database, with persons, families, places, sources, and so on. --Borgsteede 11:05, 30 May 2009 (UTC)

Mistakes?

Mistakes? On a wiki any mistake can be corrected unless it's maybe a very cunning plan by a rebellious admin. (This site does give some warnings if you let it, e.g. saving with no edit summary.) -- Robin Patterson (Talk) 06:18, 27 May 2009 (UTC)
::There are more than 6,900,000 persons in more than 2300 trees and pedigrees at the Dutch Genealogieonline. Even if only 10% of those contributors want to deal with us, it is impossible to check everything they upload in their GEDCOMs unless you do it automatically via software! Fred Bergman 06:28, 27 May 2009 (UTC)

It's under GFDL

I think that if they really want the WeRelate system they can have it. It's under GFDL. -- Robin Patterson (Talk) 06:18, 27 May 2009 (UTC)

So can we, probably more easily than the Dutch group can, because we are proficient at appropriate copying and adapting of GFDL material. The Dutch group could join us and incorporate any GFDL material they like from WeRelate, if they find that we don't have anything as good. -- Robin Patterson (Talk) 02:49, 30 May 2009 (UTC)
:The wiki is under GFDL, but Dallan Quass's improvements are not part of the wiki; are these also available? Fred Bergman 06:28, 27 May 2009 (UTC)
::As the wiki is under GFDL, anything not available under GFDL would have to specify that. Can you find out exactly which parts of WeRelate your people are interested in? -- Robin Patterson (Talk) 03:29, 30 May 2009 (UTC)

Dutchies

It is my understanding that the Dutch crowd wants
#Wiki
#Database
#Multilinguality
#Import and export of GEDCOM
Familypedia offers 1, 2 and 3.
4 is technically not that difficult and will probably be added in the near future. If that satisfies the Dutchies, they'd be nuts to replicate the efforts here or elsewhere. rtol 11:26, 30 May 2009 (UTC)
:Right ... Now let's see whether we really are talking about the same things here.
#When I say database, I mean a set of tables that can hold a couple of million persons, family relations, and events. This is where you store the data that's imported from those GEDCOMs. The millions refer to the current size of the Genealogie Online site as listed by Fred above.
#On Genealogie Online the average GEDCOM contains about 3000 persons. Some of the biggest have more than 100 000. The site can import the average GEDCOM within 15 minutes. Can you do that?
:Now, you may ask why I'm asking for that database. Well, the main reason is that when you import a GEDCOM you can hardly avoid the creation of tens or even hundreds of duplicate persons, i.e. persons that already exist somewhere else on the site. If you force users to merge all of them during the import process, there's a big chance that people will rush through the process, and either create false merges or leave far too many duplicates. That's why I think it's better to scan the site for possible duplicates, and review those afterwards. This has the added advantage that other users can contribute to the merging process.
:Now, when someone merges two persons and their families, you cannot expect the average user to merge the contents of their wiki pages manually. Not only does it take far too much time, it is also very error-prone. That's why I think that you need a database to create a new page for the merged family automatically.
:In theory, you can do all of the above using names and dates extracted from the wiki pages, provided that they are properly marked as properties.
In practice, I don't believe that it will work, because the wikia servers simply can't handle the load generated by retrieving all those pages from the MediaWiki database. A true database, which can hold an indexed list of persons ordered by surname and other properties, can achieve the same result much faster. --Borgsteede 15:44, 30 May 2009 (UTC)
::Your statements are misinformed. This site uses the same wiki software as Wikipedia, and if what you say were true, that software would not be able to handle millions of records and thousands of queries per minute. But Wikipedia has been handily doing this for such volumes for many years. The backend for it relies on MySQL. So take a look at Wikipedia before making further estimations of scaling issues. You may be interested to know that Wikipedia processes tens of thousands of transactions per hour in a replicated fashion across multiple servers with robust backup and recovery. Further, the text handling is fully Unicode enabled.
::This wiki can retrieve an indexed list of persons ordered by surname and other properties, because that is exactly how they are stored: as queryable database relations. Take a look at the #ask functionality of Semantic MediaWiki if you are interested in how to do this, and to gain further understanding of the powerful set operations and range querying capabilities.
::Any other genealogy site on the web is a toy compared to what this wiki provides in sheer power of database robustness. It is a wiki though, and you will not be able to create SQL queries directly, but you can do this indirectly with operations on multivalued properties.
::As an example of what is possible, normalizing time and place coordinates is a key issue for any genealogy software that hopes to achieve very large scales. Names are recorded in multiple variants for different languages and placename organizations that have varied over hundreds of years.
At the trivial level of database software, because these coordinates are stored as numerics, implementing a proximity search for a subject individual within a given geographic scope is possible. Anyone with knowledge of SQL knows that this sort of thing is technically simple, because all you do is a range comparison on the latitude and longitude stored as numerics. The big problem is not the database query at all, but whether you have the immense amount of knowledge necessary for mapping local names into coordinates. It might seem unrealistic that this would be achievable, but it could be if large numbers of coordinates were manually input across the globe, tied to disambiguated placenames described in multiple languages. This is precisely what Wikipedia has done, and it continues to expand its already substantial coverage. Because we use the same software, it is possible to use it as a source for this sort of knowledge. This is an example of how partnership with Wikipedia as a knowledge source provides substantial ongoing advantage for Familypedia.
::So arguably, conventional genealogy sites are using more primitive database technology, not the other way around.
::Now to the issue of merging. There are various schools of thought: basically pay now or pay later. The pay later sales pitch has strong appeal for the usual reasons that our debt-ridden society is all too familiar with. The philosophy is to open the floodgates and then sort through the mess later. The key items that enthusiasts of this approach gloss over are the details of how precisely that "pay later" clean-up process would actually work. They point to the database software as the source of this magic. They do not give the specifics of how exactly the software decides that a person with a well-known variant spelling, born in a place that happens to use a district name rather than a village name for the same location, at a similar enough time, is actually the same person.
Looking at the engineering practicalities of identifying whether you have a match or not, one wakes up to the reality that people are only waving the term "database" around as if it is a magic wand that will make these tough problems disappear. We don't yet have the natural language, knowledge base or probabilistic software services necessary for that task. But the database administrator can proudly point to the fact that they have tens of millions of records. It may all be junk that no one has any motivation to clean up ever, but they have a big number that looks impressive. Sure.
::The other school of thought is to pay now. If you need human eyeballs on the problem, you need to figure out how to get the immense numbers of people on the internet with an interest in their ancestors to invest some time. A contributor doesn't think it is drudgery to confirm whether a GEDCOM record matches up with a presented similar record, because they are motivated: making that match gets them a step closer to understanding who their ancestors were. Such an approach for GEDCOM merge is implemented on ancestry.com today. The advantage of Familypedia over ancestry.com's merge approach is that we are a free site and can draw on a larger audience of collaborators. -~ Phlox 17:10, 30 May 2009 (UTC)
:::I'm with Phlox on this one. Our database is more powerful than GEDCOM-based stuff. GEDCOM may have the advantage of legacy, but the basic design is decades out of date. I've looked at many GEDCOMs, and most of them are rubbish. Repairing the data is often more effort than starting from scratch. That said, we should have an easy option to import GEDCOMs (in lieu of the awkward option that we have already), but having Familypedia flooded with bad data is not something I would welcome. rtol 17:20, 30 May 2009 (UTC)
::::Well, when I read all this, I see no real advantage over WeRelate.org, which I already use, so I will stay there for now. Good luck.
:::::Suit yourself.
People didn't see the advantage of Wikipedia in 2002, and the vision seemed grandiose at the time. Cross leveraging of collaboratively determined knowledge has immense power, and that is the essence that you have missed in your analysis. WeRelate will be part of the past. If you wish to be part of the future, contribute here. -~ Phlox 15:26, 31 May 2009 (UTC)
::::::Most people would not recognise the future if it bit them in the face. That said, while our technology is now superior, our content is not. And most people would judge us on content first. While GEDCOM compatibility should come, I'd prefer to try Robin's idea first: copy all genealogical content from Wikipedia to Familypedia. rtol 17:52, 31 May 2009 (UTC)
:::::::Robin is absolutely correct about focusing on Wikipedia genealogical content first. It is very high quality, has disambiguated locations, is already multilingual and forms a matrix of notable people that people want to link to in their genealogies. -~ Phlox 17:11, 2 June 2009 (UTC)

SLOW

Why is this site so slow nowadays? Fred Bergman 16:01, 31 May 2009 (UTC)
:Just in case you were wondering, it has nothing to do with the SMW stuff. Actually the load from an SMW page is about 1/50 that of an article based on info pages. This can be verified by looking at the expensive parser functions. If curious, use your browser to display the source of the article you are interested in (e.g. Firefox menu View.. Page source). Search for NewPP. This is where the MediaWiki engine records how hard it worked to display a page. The critical number is the expensive parser functions value.
Pages like Geer that use no Info page features have very low numbers, like 4 out of 500 allowed:
NewPP limit report
Preprocessor node count: 2777/1000000
Post-expand include size: 32257/2097152 bytes
Template argument size: 12150/2097152 bytes
Expensive parser function count: 4/500
:Compare this to Wilhelmine von Preussen (1774-1837) with an expensive count of 238:
NewPP limit report
Preprocessor node count: 45501/1000000
Post-expand include size: 203174/2097152 bytes
Template argument size: 112227/2097152 bytes
Expensive parser function count: 238/500
::Periodically wikia has hiccups on their servers and it has nothing to do with how difficult our articles are to render. -~ Phlox 17:18, 2 June 2009 (UTC)

Update

Pages I added a few days ago now show up on top at Google. So, we've been prioritised for updates and ranks. rtol 08:30, May 4, 2010 (UTC)

The following amused me. I used my search engine (based on Yahoo, I think). "Charlemagne": we're not in the top 100. "Charlemagne family history": we're not there either. But "Charlemagne genealogy"? Well, yes, hit number 39, this:
:Forum:Google rank - Familypedia
:We're not in the top 100. I googled Charlemagne+genealogy. ... I would think, though, that more people would search on family than on genealogy. ...
:familypedia.wikia.com
-- Robin Patterson (Talk) 05:49, June 22, 2010 (UTC)

The other day I wondered whether search engines would make anything of a page that seems to have no source code except " ". This is part of the answer, hit 9 of a search I did a few minutes ago:
:Charlotte de Bourbon (1547-1582)/sensor - Familypedia
:Update/Create Sensor page ... Karoline von Braunschweig-Wolfenbüttel (1764-1788), 9) +,
:(Wilhelm I. von Württemberg ...
:familypedia.wikia.com
I think that's good news. -- Robin Patterson (Talk) 14:55, July 3, 2010 (UTC)
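
Note on the SLOW section: the two NewPP limit reports quoted there can also be compared with a few lines of script rather than by eye. What follows is only a minimal sketch in Python. It assumes the report has been copied out of the page source with one counter per line, in the "name: used/limit" shape shown in this thread. The function names parse_newpp and expensive_share are made up for this illustration and are not part of MediaWiki or any wikia tool.

```python
import re

# Matches one counter line, e.g. "Expensive parser function count: 238/500"
# or "Post-expand include size: 203174/2097152 bytes".
# The header line "NewPP limit report" has no colon, so it is skipped.
_LIMIT_LINE = re.compile(
    r"^\s*(?P<name>[^:\n]+):\s*(?P<used>\d+)/(?P<limit>\d+)", re.MULTILINE
)


def parse_newpp(report: str) -> dict[str, tuple[int, int]]:
    """Return {counter name: (used, limit)} for every counter in the report."""
    return {
        m.group("name").strip(): (int(m.group("used")), int(m.group("limit")))
        for m in _LIMIT_LINE.finditer(report)
    }


def expensive_share(report: str) -> float:
    """Fraction of the expensive parser function allowance the page used."""
    used, limit = parse_newpp(report)["Expensive parser function count"]
    return used / limit


# The two reports quoted in the SLOW section, one counter per line.
GEER = """NewPP limit report
Preprocessor node count: 2777/1000000
Post-expand include size: 32257/2097152 bytes
Template argument size: 12150/2097152 bytes
Expensive parser function count: 4/500
"""

WILHELMINE = """NewPP limit report
Preprocessor node count: 45501/1000000
Post-expand include size: 203174/2097152 bytes
Template argument size: 112227/2097152 bytes
Expensive parser function count: 238/500
"""

if __name__ == "__main__":
    for title, report in (("Geer", GEER), ("Wilhelmine von Preussen", WILHELMINE)):
        print(f"{title}: {expensive_share(report):.1%} of the expensive allowance")
```

Sorting a list of pages by expensive_share would give a rough ranking of which articles are the heaviest to render, which is the comparison being made by hand in the SLOW section.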